Pharmacogenomics polygenic risk score for drug response prediction using PRS-PGx methods

Polygenic risk scores (PRS) have been successfully developed for the prediction of human diseases and complex traits in the past years. For drug response prediction in randomized clinical trials, a common practice is to apply PRS built from a disease genome-wide association study (GWAS) directly to a corresponding pharmacogenomics (PGx) setting. Here, we show that such an approach relies on stringent assumptions about the prognostic and predictive effects of the selected genetic variants. We propose a shift from disease PRS to PGx PRS approaches by simultaneously modeling both the prognostic and predictive effects and further make this shift possible by developing a series of PRS-PGx methods, including a novel Bayesian regression approach (PRS-PGx-Bayes). Simulation studies show that PRS-PGx methods generally outperform the disease PRS methods and PRS-PGx-Bayes is superior to all other PRS-PGx methods. We further apply the PRS-PGx methods to PGx GWAS data from a large cardiovascular randomized clinical trial (IMPROVE-IT) to predict treatment related LDL cholesterol reduction. The results demonstrate substantial improvement of PRS-PGx-Bayes in both prediction accuracy and the capability of capturing the treatment-specific predictive effects while compared with the disease PRS approaches.

Zhai and colleagues present methods that use GWAS summary statistics and individual-level data from pharmacogenomics settings to build polygenic scores (PGS) with both good prognostic and predictive values. They compare their method with traditional PGS methods that use GWAS summary statistics only. I am not an expert in RCT, but I imagine this could have a major impact as a better use of PGS in RCTs. However, I have several comments and questions I would like the authors to address before I could recommend this for publication in Nature Communications. Overall, I fear that the size of the data used and the implementation of the new methods are not convincing enough.
Major comments (in no particular order): -I am not sure I understand what are the prognostic effects (betas) in the model used; when we adjust for the baseline level Y0 in the model, what would be the effects of the genetics on this change (i.e. Y1 -Y0 or Y1 / Y0)? -PRS-PGx-Bayes seems to be somewhat an extension of the PRS-CS method (using very similar Bayesian setting and tricks like regularizing phi x psi <= 1 that are not really mentioned as borrowed from there); can this really be considered a novel method? -Implementations of all methods presented here rely on very crude R code, e.g. calling cor() on the whole genotype matrix (i.e. computing and storing the full dense LD matrix); this seems highly inefficient and would never scale to the current data sizes used in summary statistics (e.g. even when restricting to e.g. 1M HapMap3 variants). -On the same note, the correlation/LD matrix between variants can often be considered as sparse (banded or even block-diagonal). This is one of the main reasons PGS methods can be scalable to e.g. using one million variants. However, when computing cor(Gi x T, Gj x T), even if Sigma_i,j = 0, cor(Gi x T, Gj x T) would be non-zero, therefore having to store and use a full LD matrix again. And LD blocks would not be valid anymore.
-The supplementary methods seem to suggest that LDpred2-auto is used here, while the code (https://github.com/cran/PRSPGx/blob/master/R/PRS_Dis_LDpred2.R) seems to suggest that fixed values for h2 and p are used instead? Also a full dense correlation matrix is used? -Are the PRS-PGx methods trained from ~3400 (=5661 * 3/5) individuals in the real data analysis? I am very surprised you can get such good predictions from such a small sample size.
-How many variants are used in the real data analysis? 9,407,967 variants are mentioned in the methods, but as mentioned before, that is not possible given the expected poor scalability of the developed methods.
Minor comments: -The name of the methods (PRS-PGx) is slightly unfortunate given that the main application is to a continuous trait (not a disease).
-Maybe it could be made clearer that the penalized regression methods are actually individual-level data methods. How can these be worse than the C+T method? -The code used at https://github.com/cran/PRSPGx/blob/master/R/PRS_PGx_Bayes.R#L69 seems to be assuming that cor(Gi x T, Gj) = cor(Gi, Gj x T), which does not seem to be true. -I don't know if it is very often the case in RCTs that mu_T = 0.5, but if so, would it be interesting to also derive equations for this special case? -I find it odd to denote a covariance (e.g. cov(Gi, Gj)) by sigma^2 since it can have a negative value.
-To be sure I understand the code of PRS-PGx-CT, the clumping is performed on the cor(Gi, Gj) while the thresholding is performed on the p-values from the model with the interaction term, and then the same SNPs are used for keeping variants in both types of effects? -Is it really useful to talk about the special case of mu_T = 0? -Could be reworded or corrected: + "its genotype main and genotype-by-treatment interaction effects" + "for the simplicity" + "an probability" + for writing expectations, maybe use "E[X]" instead of "EX" + G ~ Binomial(2; MAF) -> G ~ Binomial(2; AF) + define "GEI" Reviewer #2 (Remarks to the Author): Zhai et al. propose a method to model PRS for PGx studies by simultaneously modeling both the prognostic and predictive effects. They perform simulation studies and show that their newly developed PRS-PGx methods generally outperform the disease PRS methods with PRS-PGx-Bayes being superior. Application of PRS-PGx methods to a large cardiovascular RCT (IMPROVE-IT) to predict treatment related LDL cholesterol showed noticeable improvements.
This study is well-written and provides good data in both simulations and applied to a cardiovascular RCT for improved PRS methods for drug response prediction in PGx studies. These methods are needed, as the authors noted, most PGX PRS studies to date have relied on standard disease PRS methods which relies on stringent assumptions that are not necessarily correct and are restricted in terms of prediction potential and can result in limited power in prediction. By considering a PGx PRS that not only is built from variants with prognostic effects but also predictive effects related to treatment, an optimized PRS can be built to study treatment interaction effects for PRS in PGx studies.
I have a few comments and suggestions: 1. There has been intense focus on extending disease PRS to multi-ethnic populations. Studies have shown that PRS trained on European GWAS data have limited portability and reduced prediction in Non-European populations. The issue of multi-ethnic populations is absent in this study -how do these PGx PRS studies function with respect to trans-ancestry populations? 2. Although I understand this is primarily a methods paper demonstrating better prediction performance of PGx PRS methods for PGx studies, there is hardly any empirical results shown on the PGx PRS-Bayes method applied to the IMPROVE-IT study. Only r2 and P-values are shown but it would be informative to show effect sizes and association results of PGx PRS-Bayes for the different arms of the IMPROVE-IT study.
We would like to thank two reviewers for their critical review of our PRS-PGx manuscript. We appreciate their comments and questions, which led to a substantial improvement of the paper. Detailed responses to reviewers are given below. We highlighted the changes related to the comments in the revision.

Response to Reviewer 1
Zhai and colleagues present methods that use GWAS summary statistics and individual-level data from pharmacogenomics settings to build polygenic scores (PGS) with both good prognostic and predictive values. They compare their method with traditional PGS methods that use GWAS summary statistics only. I am not an expert in RCT, but I imagine this could have a major impact as a better use of PGS in RCTs. However, I have several comments and questions I would like the authors to address before I could recommend this for publication in Nature Communications. Overall, I fear that the size of the data used and the implementation of the new methods are not convincing enough.
Response: Thank you very much for the high-level summary of our research work in this manuscript. Please find below the detailed point-by-point response to each of the comments and we hope that we have sufficiently addressed the major concern that "the size of the data used and the implementation of the new methods are not convincing enough".
Response: Thank you for your questions. As shown in the "Results" section, equation (1) Y = Xγ +β T T +Gβ +(G×T )α+ , β is a m×1 vector of prognostic effects (i.e., genotype main effects), α is a m×1 vector of predictive effects (i.e., genotype by treatment interaction effects). The prognostic effects β represent the association strength between the genotype and the drug response Y regardless of the treatment T . In contrast, the predictive effects α represent the additional association strength between the genotype and the drug response in the treatment arm (since we coded treatment arm as 1 and control arm as 0 in the model).
When we adjust for the baseline level Y 0 in the model, the genetic effect (β or α) on the change should be interpreted as the effects in addition to the baseline's (Y 0 's) effect on the change. One of the major reasons that we adjust for Y 0 in the model is that the type I error of the baseline unadjusted model could be inflated for those variants associated with Y 0 , since the Y 0 is typically highly correlated with the change. In other words, the type I error inflation happens when baseline value Y 0 becomes a mediator between the variants and the change from baseline. This is the main conclusion from our another manuscript "A statistical perspective on baseline adjustment in pharmacogenomic genome-wide association studies of quantitative change" which has been recently accepted by npj Genomic Medicine.

References:
[1] Zhang, H., Chhibber, A., Shaw, P.M., Mehrotra, D.V. & Shen, J. A statistical perspective on baseline adjustment in pharmacogenomic genome-wide association studies of quantitative change. npj Genomic Medicine (2022). Accepted. # 2 PRS-PGx-Bayes seems to be somewhat an extension of the PRS-CS method (using very similar Bayesian setting and tricks like regularizing phi × psi ≤ 1 that are not really mentioned as borrowed from there); can this really be considered a novel method?
Response: Thanks for the question. We agree with the reviewer that our proposed PRS-PGx-Bayes method shared the same Bayesian regression framework with the PRS-CS method. We have already referred to Ge et al. (2019) when describing the details of the methods in the "Methods: PRS-PGx-Bayes" section. To further avoid potential confusion, we revised the following sentences in the "Introduction" section: "... Moreover, by extending the idea of global-local scaling parameters from disease GWAS (Ge et al., 2019) to PGx GWAS, the PRS-PGx-Bayes method is able to infer the posterior prognostic and predictive effects simultaneously ..." In the "Methods: PRS-PGx-Bayes" section, we revised the following sentences: "... Also as proposed by Ge et al. (2019), to avoid numerical issues caused by collinearity between SNPs, we set φψ j ≤ ρ and φξ j ≤ ρ ..." Although the proposed PRS-PGx-Bayes method could be considered as an extension of the PRS-CS method from disease PRS space to PGx PRS space, the PRS-PGx-Bayes method we proposed has several unique features. Below we briefly summarize the novelties of the PRS-PGx-Bayes method in terms of the concept, the model, and the technicality.
Concept. The PRS-PGx-Bayes method (and other PRS-PGx methods) constructs both prognostic score and predictive score by incorporating the genetic main effect and the gene-bytreatment effect simultaneously to improve drug response prediction. To our best knowledge, no existing methods including the PRS-CS method can be directly applied to jointly model both prognostic and predictive effects for drug response prediction. The necessity of PGx-PRS approaches is validated by the proof of extremely stringent assumptions needed for the PRS-Dis approach to predict drug response. We believe that our theoretical framework and the concept of jointly modelling both prognostic and predictive effects for the construction of PGx PRS in PGx studies are novel in the field.
Model. To implement the concept of joint modelling of main and interaction effects, we extended the Bayesian model used in the PRS-CS method to accommodate both effects and their correlation in the PRS-PGx-Bayes method. Therefore, a model of scalar β becomes a model of vector (β, α). A single variance parameter of β becomes a variance-covariance matrix of (β, α). A joint distribution of prognostic and predictive effects is needed when considering the posterior instead of a univariate distribution. We believe that this systematic extension provides more appropriate methods and tools for polygenic score application in PGx studies.
Technicality. In terms of the implementation details, compared with the PRS-CS method, the PRS-PGx-Bayes method proposed to use the hierarchical half-t prior (Huang & Wand, 2013) to model the variance-covariance matrix of a multivariate normal distribution. Although the inverse Wishart (IW) prior may seem to be a more natural extension of Ge et al. (2019) and a more common practice when modelling covariance matrix, the IW prior imposes a dependency between the correlations and the variances: larger variances automatically imply the absolute value of the correlation near one while small variances imply the correlation near zero (Tokuda et al., 2011;Alvarez et al., 2014). We believe that the adoption of the hierarchical half-t prior in our PRS-PGx-Bayes method provides a better approach for modeling the variance-covariance matrix of a multivariate normal distribution in PGx applications.
# 3 Implementations of all methods presented here rely on very crude R code, e.g. calling cor() on the whole genotype matrix (i.e. computing and storing the full dense LD matrix); this seems highly inefficient and would never scale to the current data sizes used in summary statistics (e.g. even when restricting to e.g. 1M HapMap3 variants).
Response: Thank you for the comment. To avoid confusion, we would like to point out that our algorithm including cor() was applied on LD blocks (Berisa & Pickrell, 2016) in both simulations and real data analysis by recognizing that it is impossible to call cor() on the whole genome or chromosomes. The LD blocks were defined in Berisa & Pickrell (2016) and were publicly available at https://bitbucket.org/nygcresearch/ldetect-data. The same "by LD block" strategy was also adopted by Ge et al. (2019). More detailed justifications about this strategy are given in our response to Comment # 4.
To directly address reviewer's concern on the computational costs of cor(), we conducted additional simulations by applying cor() on the largest LD block (chr 6, LD block 33). The simulation results were presented in Response Figure 1. After matching IMPROVE-IT PGx data to 1000 Genomes (1KG) data, we had 11,769 SNPs left in this block. We then evaluated the computational time and memory by randomly choosing 1,000, 3,000, 5,000, 7,000, 9,000 and the full 11,769 SNPs from the block. As expected, the computational time and memory increased at the rate of m 2 , where m denotes the number of variants. But even for the largest LD block, calling cor() took around 3 minutes and 1GB memory, which is quite reasonable.
To further clarify the computational burden, in the Supplemental materials, we added the distribution of block sizes (i.e., the number of SNPs in each LD block) as the Supplementary Figure 12. In European population, there are 1,725 independent LD blocks across the whole genome. The largest LD block in the IMPROVE-IT PGx data is on chromosome 6 with 18,279 SNPs; after matching to 1KG,11,769 SNPs are left to be handled simultaneously.
In the "Methods: PRS-PGx-Bayes" section, we revised the following paragraph: "... In addition, we partition the genome into 1,725 largely independent genomic regions (Berisa & Pickrell, 2016) estimated using data from the 1000 Genomes European sample, and further conduct multivariate update of the effect sizes within each LD block. The distributions of block sizes in 1KG and IMPROVE-IT data are summarized in Supplementary Figure  We also conducted additional simulation to evaluate the total computational time of PRS-PGx-Bayes function using the largest LD block (i.e., chr 6, LD block 33). The results were presented in Supplementary Figure 9. The total computational time were assessed using PRS-PGx-Bayes function with 1,000 MCMC iterations, which also increased at the rate of m 2 to m 3 , where m denotes the number of variants. It took 43.5 hours for the largest LD block and 3.4 hours for the median LD block. It is worth mentioning that the computation in each LD block is independent, meaning we can easily distribute the computation of 1,725 LD blocks under the High Performance Computing environment to reduce the run time. In the authors' working environment, the IMPROVE-IT whole genome analysis took about 40 hours, where typically 200 -400 jobs were run simultaneously.
Supplementary Figure 9: Computational time on the largest LD block (chr 6, block 33) by running PRS-PGx-Bayes function with 1,000 MCMC iterations. Number of variants = 1,000, 3,000, 5,000, 7,000, 9,000, and 11,769 (whole LD block). The real genetic data was obtained from the IMPROVE-IT trial with the sample size of 5,661. The effect sizes and phenotype data were simulated with heritability fixed at 0.3, ψ/ξ = 1, and P(causal) = 0.01.
In the "Results: Simulation studies" section, we added a subsection of "Computational time": "... To assess the computational burden of the proposed method, we applied the PRS-PGx-Bayes function with 1,000 MCMC iterations to chromosome 6, LD block 33 (the largest LD block with 11,769 SNPs). As a sensitivity analysis, we also explored scenarios by randomly choosing 1,000, 3,000, 5,000, 7,000, 9,000 SNPs from that block. The real genetic data was obtained from the IMPROVE-IT trial with a sample size of 5,661. The effect sizes and phenotype data were simulated with heritability fixed at 0.3, ψ/ξ = 1, and P(causal) = 0.01. The tuning parameters were selected via cross-validation. The computation was completed on a single core of 2.4 GHz Intel Core i5. We summarized the result in Supplementary Figure 9, which shows that the computational time increased at the rate of m 2 to m 3 , where m denotes the number of variants. The result also shows that it took roughly 43.5 hours for the largest LD block and 3.4 hours for the median-size LD block (Supplementary Figure 12) to complete the computation. In practice, since the computation in each LD block is independent, we could further shorten the computational time by parallel computing 1,725 LD blocks across the whole genome. In the authors' High Performance Computing working environment, the IMPROVE-IT whole genome analysis took about 40 hours, where typically 200 -400 jobs were run simultaneously. ..." References:  # 4 On the same note, the correlation/LD matrix between variants can often be considered as sparse (banded or even block-diagonal). This is one of the main reasons PGS methods can be scalable to e.g. using one million variants. However, when computing cor(Gi × T, Gj × T), even if Sigma_i,j = 0, cor(Gi × T, Gj × T) would be non-zero, therefore having to store and use a full LD matrix again. And LD blocks would not be valid anymore.
Response: Thanks for the insightful comment. We agree with the reviewer that even if SNP i and SNP j are independent, cor(G i × T, G j × T ) would still be non-zero. In fact, we can prove that under typical 1:1 treatment-placebo allocation ratio in randomized clinical trials (please see the our proof added to Supplementary Method D; also see response to Comment # 11 for details), where f i , f j are the MAFs of SNP i and j, respectively. When the treatment-placebo allocation ratio is not 1:1, in addition to MAFs, only the percentage of treatment patients (p) is needed to determine cor(G i × T, G j × T ). Therefore, if needed, we don't have to store a full LD matrix to track values of cor(G i × T, G j × T ). Instead, we only need to record a vector of f i 's and a scalar p.
On the validity of the LD blocks, we want to point out that the "by LD block" implementation strategy was the choice we made to balance the computational costs and the prediction accuracy. We believe that the accuracy loss from ignoring cor(G i × T , G j × T ) is acceptable in exchange for feasible computation time in the whole genome data analysis.
To validate this point, we performed additional simulation studies using block 31 and block 32 on chromosome 19, following the same simulation settings in the main text. The PRS-PGx-Bayes function was applied on these blocks separately (i.e., block-by-block) and jointly (i.e., full LD matrix was used). We also applied PRS-PGx-Bayes method to the uniform blocks with size = 200, 500, and 2,500 (i.e., the number of variants in each block = 200, 500, and 2,500). Similarly, as Fig. 2 in the main text, the heritability was fixed at 0.3 and ψ/ξ = 1. The numbers of the causal variants for P(causal) = 0.001, 0.01 and 0.1 were 5, 50 and 500, respectively. The training sample size was 3,000. The tuning parameters were selected via cross-validation in the training data. Performance comparisons of PRS-PGx-Bayes method with different implementation strategies were summarized and added to Supplementary Figure 4.
Overall, there is indeed a slightly decreasing trend in R 2 and -log(p-value) from using the full genotype matrix to using uniform blocks with size 200. However, such differences across different types of blocks are limited, especially between LD blocks and full genotype matrix. For example, when p(causal) = 0.001, compared to full genotype matrix (orange bar), the LD block approach (red bar) decreases R 2 by 0.4%. We also summarized the relative decreases of LD block approach compared to the full genotype approach in Supplementary Table 1. The table shows that the relative R 2 decreases of the LD block approach in simulation studies are very little (i.e., all ≤ 1.1%).
Supplementary Figure 4: Performance comparisons of PRS-PGx-Bayes method when carried out by different implementation strategies (based on LD blocks 31 and 32 on chromosome 19) in simulation studies, where heritability was fixed at 0.3, training sample size = 3,000, and ψ/ξ = 1. PRS-PGx-Bayes was carried out by the uniform block with size 200, 500, and 2,500, the LD block, and the full genotype data. The numbers of the causal variants for P(causal) = 0.001, 0.01 and 0.1 were 5, 50 and 500, respectively. The tuning parameters of PRS-PGx-Bayes were selected via cross-validation in the training data. The error bar represents the standard error of the results calculated from the testing sets. The performance was assessed in the testing set in terms of (a) prediction accuracy R 2 of S P Gx in two arms, (b) p-value for the predictive effect of S pred × T interaction, (c) R 2 of S P Gx under treatment arm, and (d) R 2 of S P Gx under control arm.
Supplementary Table 1: Performance decreases of PRS-PGx-Bayes method carried out from using full genotype matrix (i.e., using LD blocks 31 and 32 jointly on chromosome 19) to using LD blocks (i.e., using LD blocks 31 and 32 separately on chromosome 19) in simulation studies. Heritability was fixed at 0.3, training sample size = 3,000, and ψ/ξ = 1. The numbers of the causal variants for P(causal) = 0.001, 0.01 and 0.1 were 5, 50 and 500, respectively. The performance was assessed in terms of prediction accuracy R 2 of S P Gx in two arms, p-value for the predictive effect of S pred × T interaction, R 2 of S P Gx under treatment arm, and R 2 of S P Gx under control arm.
To summarize what we just discussed, in the "Results: Simulation studies" section, we added the following paragraph: "... To further assess the impact of different implementation strategies, we also performed additional simulations where PRS-PGx-Bayes function was applied on LD blocks jointly (i.e., full LD matrix across LD blocks was used). The simulation settings remained the same as described in Fig. 2. As a sensitivity analysis, we also applied PRs-PGx-Bayes method to the uniform blocks with number of variants in each block as 200, 500, and 2,500, respectively. Supplementary Figure 4 shows that there is a slightly decreasing trend in R 2 and -log(p-value) from using the full genotype matrix to using uniform blocks with size 200. However, such differences across different types of blocks are limited, especially between LD blocks and full genotype matrix. For example, when P(causal) = 0.001, compared to using the full genotype matrix, the LD block approach only decreases R 2 by 0.4%. The relative decreases of LD block approach compared to the full genotype approach was summarized in Supplementary Table 1. The table shows that the relative decreases of the LD block approach in simulation studies are very small (i.e., all ≤ 1.1%) ..." In the "Methods: PRS-PGx-Bayes" section, we added the following sentences: "... As shown in Supplementary Figure 4 and Supplementary Table 1, this strategy is justified by that fact that only a small relative difference (≤ 1.1%) is observed when the PRS-PGx-Bayes function is carried out by LD blocks, compared to across multiple LD blocks in simulation studies ..." In the "Discussion" section, we added the following sentences: "... Lastly, we suggest applying our proposed methods by LD blocks. Expanding the size of blocks (e.g., using the whole chromosome) may slightly improve prediction accuracy but also significantly increase computational costs. On the other hand, further reducing the size of blocks (e.g., using the uniform blocks with smaller size) can reduce run-time but also possibly increase the bias by missing long-range LD. We believe that the by LD block strategy is a right trade-off between decent modeling accuracy and feasible computational time. We also acknowledge that future work is needed to further improve the computational efficiency to incorporate more SNPs for simultaneous analysis ..." References: # 5 The supplementary methods seem to suggest that LDpred2-auto is used here, while the code (https://github.com/cran/PRSPGx/blob/master/R/PRS_Dis_LDpred2.R) seems to suggest fixed values for h 2 and p are used instead? Also a full dense correlation matrix is used?
Response: Thanks for pointing out this inconsistency. We used LDpred2-grid method (Privé et al., 2020) for the implementation of PRS-Dis-LDpred2 in our manuscript. Per your suggestion, we changed the description of PRS-Dis-LDpred2 in Supplementary Method C to the following: "... As an updated version of LDpred, LDpred2-grid (Privé et al., 2020) introduces a third hyperparameter indicating whether sparsity is enabled or not. When implementing the PRS-Dis-LDpred2 method, we tested a grid of hyper-parameters with p and h 2 . We also enabled the sparsity by setting "sparse = TRUE", aimed at providing sparse effect size estimates, i.e., shrinking some effects to exactly 0 ..." The reason why we used LDpred2-grid , instead of LDpred2-auto, for our simulation studies and real data analysis was because although their performances were generally comparable, we found that LDpred2-grid slightly outperformed LDpred2-auto in many scenarios (e.g., the proportion of causal variants was small to moderate) in our additional simulations.
Specifically, as presented in the Supplementary Figure 14, we conducted additional simulation studies following the same settings described in the "Methods: Simulation" section. The heritability was fixed at 0.3 and ψ/ξ = 1. Numbers of the causal variants for P(causal) = 0.001, 0.01 and 0.1 were 5, 50 and 500, respectively. The training sample size was 20,000. The tuning parameters of LDpred2-grid were selected via cross-validation in the training data. We compared LDpred2-grid with LDpred2-auto using the same performance metrics described in Fig. 2, which were R 2 of S P Gx in the whole population, treatment arm, and control arm, respectively, and the predictive p-value of S pred × T interaction. As we could see, when P(causal) was low or moderate (i.e., P(causal) = 0.001 or 0.01), LDpred2(-grid) was slightly better than LDpred-auto; and when P(causal) was high (i.e., P(causal) = 0.1), LDpred2-auto outperformed LDpred2(-grid) a little bit. However, overall the two methods' performance was very close to each other. These observations were consistent with Privé et al. (2020).
We added Supplementary Figure 14 to the Supplemental materials document.
Supplementary Figure 14: Predictive performance of LDpred2-grid and LDpred2-auto methods in the simulation studies, where heritability was fixed at 0.3 and ψ/ξ = 1.
Numbers of the causal variants for P(causal) = 0.001, 0.01 and 0.1 were 5, 50 and 500, respectively. The training sample size was 20,000. The tuning parameters of LDpred2-grid were selected via cross-validation in the training data. The error bar represents the standard error of the results calculated from the testing sets. The performance was assessed in the testing set in terms of (a) prediction accuracy R 2 of S P Gx in two arms, (b) p-value for the predictive effect of S pred × T interaction, (c) R 2 of S P Gx under treatment arm, and (d) R 2 of S P Gx under control arm.
In Supplementary Method C, we added the following paragraph: "... Besides LDpred2-grid, Privé et al. (2020) also proposed another version called LDpred2-auto, which estimates p and h 2 within the model. This makes LDpred2-auto a method free of hyperparameters. More specifically, to estimate p in the Gibbs sampler, LDpred2 counts the number of non-zero variants as M c = j (β j = 0) ∼ Binomial(m, p); and estimates h 2 = β T Rβ, where R is the correlation matrix. To determine which method, LDpred2-grid or LDpred2-auto, should be used as PRS-Dis-LDpred2 function, we further conducted simulation studies following the same simulation method described in the main text. The heritability was fixed at 0.3 and ψ/ξ = 1.
Numbers of the causal variants for P(causal) = 0.001, 0.01 and 0.1 were 5, 50 and 500, respectively. The training sample size was 20,000. The tuning parameters of LDpred2-grid were selected via cross-validation in the training data. Supplementary Figure 14 indicated that although overall the two methods' performance was very close to each other, LDpred2-grid actually slightly outperformed LDpred2-auto in many scenarios (i.e., when the proportion of causal variants was small to moderate). Therefore, in our study, we only used LDpred2-grid as the best disease PRS method to be compared with our proposed PRS-PGx approaches ..." The PRS-Dis-LDpred2 function (a disease PRS approach) was also applied on LD blocks simultaneously (assuming independence among different LD blocks). To further avoid potential confusion, in Supplementary Method C, we added the following sentences: "... We directly use the disease PRS LDpred2 method by LD blocks as the PRS-Dis-LDpred2 method ..." References: [1] Privé, F., Arbel, J. & Vilhjálmsson, B. J. LDpred2: better, faster, stronger. Bioinformatics 36, 5424-5431 (2020).
# 6 Are the PRS-PGx methods trained from ∼ 3400 (=5661 * 3/5) individuals in the real data analysis? I am very surprised you can get such good predictions from such a small sample size.
Response: Thank you for your question and comment. In our analysis of the IMPROVE-IT GWAS data, the PRS-PGx methods were trained from ∼4,528 (= 5,661*4/5) individuals based on a 5-fold cross-validation procedure (i.e., the PGx GWAS summary statistics were obtained based on the training set of ∼4,528 individuals). The information about the CV procedure was provided in the "Methods: IMPTOVE-IT PGx GWAS data analysis" section.
Regarding the concern about getting good predictions from a small sample, we would like to point out that 1) a sample size of 5,661 might be small in disease GWAS but is actually considered large in PGx studies since the patients are from clinical trials; 2) the good prediction results probably came from the fact that drug responses typically have larger genetic effects and higher heritability. Hence, smaller (compared to disease GWAS) sample size is adequate for good prediction in PGx GWAS setting.
For example, Harper and Topol (2012) examined all of the GWAS catalogued in the US National Human Genome Research Institute at the time and showed that pharmacogenomics studies are sevenfold more likely to achieve odds-ratio (OR) > 3.0 compared to common disease GWAS in which the ORs are typically in the range of 1.05 to 1.15. Maranville and Cox (2016) also "found significantly larger effect sizes for studies focused on pharmacogenomic phenotypes, as compared to complex disease risk, morphological phenotypes, and endophenotypes." In summary, although sample sizes from PGx studies are usually smaller than those in disease GWAS, the significantly larger effect sizes from PGx studies can result in decent power to detect variants associated with drug response phenotypes and further enable good prediction performance of PRS for drug response prediction in PGx studies.
References:  # 7 How many variants are used in the real data analysis? 9,407,967 variants are mentioned in the methods, but as mentioned before, that is not possible given the expected poor scalability of the developed methods.
Response: Thanks for your question. Yes, 9,407,967 variants were used in the real data analysis. As we discussed in response to Comment # 3, our functions were applied by LD blocks (instead of by chromosome or by whole genome). The block sizes of IMPROVE-IT data range from 126 to 18,279. Please refer to the response to Comment # 3 for the block size distribution and the related computational time illustration. In summary, we believe our implementation strategy is a right trade-off between maintaining decent modeling accuracy and reducing computational costs. The total computation time for the whole genome data analysis using the PRS-PGx methods is reasonable (in the authors' working environment, the IMPROVE-IT whole genome analysis carried out by PRS-PGx-Bayes algorithm took about 40 hours, where typically 200 -400 jobs were run simultaneously).

Minor comments
# 8 The name of the methods (PRS-PGx) is slightly unfortunate given that the main application is to a continuous trait (not a disease).
Response: Thank you for your comment. We agree with the reviewer that Polygenic Risk Score (PRS) sounds more like a score for disease risk prediction, as opposed to a drug response prediction. We chose the name PRS to be consistent with the terminology used in the PRSice method and the PRS-CS method.
We would also like to point out that, although we are doing drug response prediction, it can be interpreted as the risk prediction of not having desirable outcomes, such as not responding to the treatment or elevated incidence rates of adverse events. Similar to disease phenotypes, there are usually three types of drug response phenotypes in pharmacogenomics studies: continuous (e.g., LDL change from baseline), binary (e.g., adverse drug reaction) and time-to-event (e.g., the time to cardiovascular death).
Although we mainly implement the PRS-PGx methods in the analysis of continuous drug response phenotypes, all of the PRS-PGx methods, except for PRS-PGx-Bayes, can be directly applied to binary and survival phenotypes. Furthermore, it is possible in future research to extend the PRS-PGx-Bayes to analyzing binary and survival endpoints by adopting the Bayesian logistic regression and Bayesian Cox proportional hazards model instead of the Bayesian linear regression as mentioned in the "Discussion" section.
# 9 Maybe it could be made clearer that the penalized regression methods are actually individual-level data methods. How can these be worse than the C+T method?
Response: Thanks for the suggestion. We agree to make it clearer that the penalized regression methods use individual-level data. We revised the following sentences in the "Introduction" section: "... These methods include PRS-PGx-Unadj (Unadjusted), PRS-PGx-CT (Clumping + Thresholding), PRS-PGx-L, -GL, -SGL (-Lasso, -Group Lasso, -Sparse Group Lasso regression) methods. Our proposed methods use only PGx genome-wide association summary statistics and an external LD reference panel except for the penalized regression based methods, which require access to individual level genetic and phenotypic data. ..." In the "Methods: PRS-PGx-L, -GL and -SGL" section, we also added the following sentences: "... In the PRS-PGx-L, -GL and -SGL methods, penalized regression is applied to estimate equation (1) with individual-level data ..." Regarding the question about the performance comparison between the penalized regression methods and the C+T method, we would like to clarify that, in general, two factors may impact the comparison: 1) the difference in the data that was used; 2) the difference in the methodology.
On the first point, using individual-level data may not necessarily increase the prediction performance in a significant way. Lin and Zeng (2010a, 2010b) showed that using GWAS summary statistics is not worse than the individual-level data as long as both analyses are conducted under the same model and assumptions. To quote directly from Lin and Zeng (2010b), "We show theoretically and numerically that meta-analysis of summary results is statistically as efficient as joint analysis of individual participant data".
On the second point, we would like to point out that although Lasso based methods may be more sophisticated, their performances are not guaranteed to be higher than the simpler methods like C+T. For example, in the paper of Lassosum (Mak et al., 2017), their Figure 2B (the real data analysis results) also shows that for some particular diseases, i.e., Bipolar disorder (BD) and Type 2 diabetes (T2D), C+T could outperform the Lasso based method. We added Figure 2B of Mak et al., (2017) here as the Response Figure 2. In our analysis, there are scenarios where Lasso based methods outperformed the method C+T. For example, Supplementary Figure 5 shows that when the heritability was 0.5, PRS-PGx-GL was much better than PRS-PGx-CT in terms of both prediction accuracy and patient stratification. In Supplementary Figure 7, when the proportional of causal variants was low (i.e., the effect size of each causal variant was large), PRS-PGx-SGL was better than, or at least comparable to, PRS-PGx-CT.
In summary, we believe the C+T and Lasso based methods have their own advantages and disadvantages depending on the heritability and the number of causal variants. Therefore, the performance comparison between C+T and Lasso is ultimately trait-specific in disease PRS; and similarly, the comparison is drug-response-specific in PGx PRS.
In the "Discussion" section, we added the following sentences: "... Interestingly, we find that the C+T method (PRS-PGx-CT) can outperform the penalized regression based methods (PRS-PGx-L, PRS-PGx-GL and PRS-PGx-SGL) in some of the scenarios. This pattern was also observed in Mak et al., (2017). Our results suggest that Lasso based methods are sensitive to the noise, and perform poorly when the signal-to-noise ratio is small (Supplementary Figure 5 and 7). Further study is needed to examine the difference more comprehensively ..." References:   # 10 The code used at https://github.com/cran/PRSPGx/blob/master/R/ PRS_PGx_Bayes.R L69 seems to be assuming that cor(Gi × T, Gj) = cor(Gi, Gj × T), which does not seem to be true.
Response: Thanks for pointing this out. We agree with you that in general cor(G i × T, G j ) = cor(G i , G j × T ). However, we want to clarify that we treat these two terms approximately equal because we can show in theory (added to the Supplementary Method D) that the difference is pretty small and has little impact on the performance of the method we proposed. We used this approximation under the belief that it can help reduce the computational cost. However, as shown in the Response Table 1, the reduction is limited. We will revise our PRS-PGx-Bayes function in the PRSPGx R package in the next CRAN version update.
To evaluate the difference between these two terms, we make the following assumptions. Let f i be the MAF of G i . Under Hardy-Weinberg equilibrium, 14 Response Figure 2: Performance of Lassosum vs. other methods when using real summary statistics data from meta-analyses. Predictive accuracy was assessed by predictio nin the WTCCC dataset after the contribution from WTCCC was removed from the summa rystatistics. (Mak et al., 2017) [Editorial note: this figure was redacted due to third-party rights. It can be found in Mak et al., 2017 [3], Figure 2B] [redacted] We use subscript j and T for G j and T , respectively. σ ij denotes the covariance between G i and G j . Further assume T and G i (or G j ) are independent, we derived (in Supplementary Method D) that: Without loss of generality, assume 0 < f i ≤ f j ≤ 0.5. Follow the deduction in VanLiere and Rosenberg (2008), it can be shown that − By Lagrangian multipliers, we can show that where the maximum is attained at the boundary f j = 0.5, To further investigate how this approximation would affect the predictive performance, we re-ran the simulation following the "Methods: Simulations" section in the main text. Simulation settings remained the same as described in Fig. 2. Four criteria were used to compare the performance of the algorithm with vs without the approximation: R 2 under the whole population, treatment arm, and control arm, as well as the predictive p-value. The comparison results were added into the Supplemental materials document as Supplementary Figure 15. The figure shows that there is very little difference between the algorithm with the approximation and without the approximation.
To summarize what we just discussed, in Supplementary Method D, we added the following paragraph: "... In practice, we approximated cor(G × T, G) ≈ cor(G, G × T) in the top-right block and the bottom-left block of D, which can reduce the computational cost of PRS-PGx-Bayes function. We can prove that such approximation held with the maximal difference smaller than 0.066 ... To further evaluate the impact of such approximation on the final results, we compared predictive performances of PRS-PGx-Bayes with and without approximation in the simulation studies, where simulation settings remained the same as described in Fig. 2 in the main text (i.e., the heritability was fixed at 0.3, the training sample size was 3,000, and ψ/ξ = 1). Supplementary Figure 15 suggested that there was very little difference in terms of predictive performances when the approximation was used ..." We introduced this approximation strategy because we believed it can help reduce the computational cost. The Response Table 1 was generated following the same simulation setting described in the Comment #3. Specifically, we checked the computational time of chr 6, LD block 33 (the largest LD block with 11,769 SNPs) based on one single core. We also explored scenarios by randomly choosing Supplementary Figure 15: Predictive performance of PRS-PGx-Bayes with or without approximation (i.e., cor(G × T, G) ≈ cor(G, G × T)) in the simulation studies, where heritability was fixed at 0.3 and ψ/ξ = 1. The numbers of the causal variants for P(causal) = 0.001, 0.01 and 0.1 were 5, 50 and 500, respectively. The training sample size was 3,000. The error bar represents the standard error of the results calculated from the testing sets. The performance was assessed in the testing set in terms of (a) prediction accuracy R 2 of S P Gx in two arms, (b) p-value for the predictive effect of S pred × T interaction, (c) R 2 of S P Gx under treatment arm, and (d) R 2 of S P Gx under control arm.
1,000, 3,000, 5,000, 7,000, 9,000 SNPs from that block. The computational time was assessed using PRS-PGx-Bayes function with 1,000 MCMC iterations, and the tuning parameters were selected via cross-validation. It turned out that such approximation actually cannot reduce the computational time significantly (only around 1.5% reduction). # 11 I don't know if it is very often the case in RCTs that mu_T = 0.5, but if so, would it be interesting to also derive equations for this special case?

Response
Response: Thanks for your question. Yes, µ T = 0.5 corresponds to a 1-to-1 treatment-placebo allocation ratio, which is the most commonly used ratio in randomized clinical trials. It is also possible that µ T is different from 0.5 such as when the allocation ratio is 2-to-1 or 3-to-1 etc.
We agree that it would be more informative to provide some equations for the special cases. Therefore, we added the following paragraphs to Supplementary Method D: Now we consider several special cases of: .
# 12 I find it odd to denote a covariance (e.g. cov(Gi, Gj)) by σ 2 ij since it can have a negative value.
Response: Thanks for pointing this out. We have replaced σ 2 ij by σ ij in Supplementary Method D accordingly.
# 13 To be sure I understand the code of PRS-PGx-CT, the clumping is performed on the cor(Gi, Gj) while the thresholding is performed on the p-values from the model with the interaction term, and then the same SNPs are used for keeping variants in both types of effects?
Response: Thanks for the question. The clumping is performed on the cor(G i , G j ) while the thresholding is performed on the 2-df test (i.e., joint test of G + G × T) p-values. And then the same SNPs are used for keeping variants in both types of effects.
# 14 Is it really useful to talk about the special case of µ T = 0?
Response: Thanks for your question. We acknowledge that the special case of µ T = 0 is too simplified. Therefore, we now provide the deduction of E[β j | β j ] and E[α j | α j ] for µ T = 0.5 in Supplementary Method E. We deleted the special case of µ T = 0 and updated the "Methods: PRS-PGx-Bayes" section as the following: "... Second, assume a one-to-one treatment-placebo allocation ratio (µ T = 0.5), unlinked genetic markers (σ ij ≡ 0 for i = j) and ρ j ≡ 0 (i.e., within each SNP, the two effect sizes are independent). We can derive the formulas explicitly for E[β j | β j ] and E[α j | α j ] (Supplementary Method E). Under the simplified scenario where all markers' MAFs are small, f j ≡ f → 0, we can show that: as the shrinkage factors. Therefore,β j /t j andα j /s j are the 'shrunk' effects: t j = s j = 1 indicates no shrinkage while t j = s j → ∞ yields full shrinkage. The correlation c between G j and G j × T also contributes to the second part of the numerator because the bias induced by the positive correlation needs to be corrected. ..." # 15 Could be reworded or corrected: + "its genotype main and genotype-by-treatment interaction effects" + "for the simplicity" + "an probability" + for writing expectations, maybe use "E[X]" instead of "EX" + G ∼ Binomial(2; MAF) → G ∼ Binomial(2; AF) + define "GEI" Response: Thanks for pointing these out.
In the "Introduction" section, we revised the following sentence: "... have a constant ratio between its genotype main effect and genotype-by-treatment interaction effect ..." In the "Results: Conceptual framework of the PRS-PGx methods" section, we revised the following sentence: "... For the simplicity ..." "... and g is a probability density function of a random matrix ..." In the Supplementary Method F, we revised the following sentence: In the "Discussion" section, we revised the following sentence: "... Recent studies have shown that Genotype-by-Environment Interaction (GEI or G × E) may explain ..." For the fifth point (G ∼ Binomial(2; MAF) → G ∼ Binomial(2; AF)), we think it might be more accurate to keep "MAF" (Minor Allele Frequency) since genotypes in our simulation studies and real data analysis were coded as the number of minor alleles by assuming an additive genetic model.

Response to Reviewer 2
Zhai et al. propose a method to model PRS for PGx studies by simultaneously modeling both the prognostic and predictive effects. They perform simulation studies and show that their newly developed PRS-PGx methods generally outperform the disease PRS methods with PRS-PGx-Bayes being superior. Application of PRS-PGx methods to a large cardiovascular RCT (IMPROVE-IT) to predict treatment related LDL cholesterol showed noticeable improvements. This study is well-written and provides good data in both simulations and applied to a cardiovascular RCT for improved PRS methods for drug response prediction in PGx studies. These methods are needed, as the authors noted, most PGX PRS studies to date have relied on standard disease PRS methods which relies on stringent assumptions that are not necessarily correct and are restricted in terms of prediction potential and can result in limited power in prediction. By considering a PGx PRS that not only is built from variants with prognostic effects but also predictive effects related to treatment, an optimized PRS can be built to study treatment interaction effects for PRS in PGx studies.
Response: We thank the reviewer for the high-level summary of our research work in this manuscript.

Major comments
# 1 There has been intense focus on extending disease PRS to multi-ethnic populations. Studies have shown that PRS trained on European GWAS data have limited portability and reduced prediction in Non-European populations. The issue of multi-ethnic populations is absent in this study -how do these PGx PRS studies function with respect to trans-ancestry populations?
Response: Thank you for raising this good point. We agree with the reviewer that in the disease PRS area, there have been quite a few publications/preprints addressing this issue. For example, Márquez-Luna et al. (2017) used data from multiple ethnic groups for training and combined the per ethnic group scores by weighted averaging them into a joint score that can increase the prediction performance for non-Caucasian groups. Ruan et al. (2021) presented another methodology by incorporating a similar weighted average score into the Bayesian regression framework and reported the improvement of the prediction of quantitative traits and schizophrenia risk in non-European populations. Therefore, it is reasonable to assume that a dedicated multi-ethnic PGx-PRS method can improve the drug response prediction in non-Caucasian populations.
To clarify that our methods were developed based on single-ethnic population (e.g., European population), in the "Discussion" section, we added the following sentences: "... Moreover, our methods are developed based on single-ethnic population (e.g., European population). A direct application of this score to other ethnic groups may result in considerable loss in prediction accuracy. It is, therefore, of great research interest to extend the proposed PRS-PGx approaches to the trans-ethnic scenario in future ..."

References:
[1] Márquez-Luna, C., Loh, P., South Asian Type 2 Diabetes (SAT2D) Consortium, SIGMA Response: Thank you for your suggestion. We agree with you that it is important to add detailed results about the effect sizes. Per your suggestion, we revised Table 1 to include more information (i.e. βs, SEs). Two arms 1 T arm 2 C arm 3 Two arms T arm C arm Two arms T arm C arm 1 Two-arm model: In the "Results: Polygenic prediction of drug responses in the IMPROVE-IT PGx GWAS study" section, we added the following sentences to describe Table 1: "... Table 1 also shows that the marginal effect sizes β G of S PGx from the model Y ∼ S PGx under treatment arm were all negative across different PRS-Dis and PRS-PGx methods, indicating that a larger PRS would result in more LDL reduction after 1-month treatment of Ezetimibe + Simvastatin. In the meantime, PRS-PGx-Bayes method outperformed the others with the largest absolute value of effect size β G . Similarly, the interaction effect sizes β G×T of S pred from the model Y ∼ T+S prog +T×S pred were all negative across all methods, implying that a larger predictive score would result in a larger treatment effect (i.e., Ezetimibe + Simvastatin combination vs. Simvastatin monotherapy). PRS-PGx-Bayes method is also superior to the others with the largest absolute value of effect size β G×T ..."   # 4 Following on comment above, for disease PRS, studies have reported the top X percentile of PRS vs. rest of distribution and highlight X fold odds ratio which can be informative for risk stratification.
Can a similar logic be used here?
Response: This is a very good point. Per your suggestion, we revised Fig. 4 in the manuscript as follows. Fig. 4 Patient stratification performance of six polygenic prediction methods in the IMPROVE-IT PGx real data analysis. a Quantile plot of treatment effect using four fixed quantiles (0% ∼ 25%, 25% ∼ 50%, 50% ∼ 75% and 75% ∼ 100%). TE stands for "treatment effect", and CI denotes "confidence interval". b Differential treatment effect when patients were stratified into top 10%, 20%, · · · , 90% percentile of the predictive score vs. the rest, respectively.
We also revised the description of Fig. 4 in the "Results: Polygenic prediction of drug responses in the IMPROVE-IT PGx GWAS study" section as follows: "... We further compared the patient stratification performance across different methods with the results summarized in Fig. 4. In Fig. 4a, we used four fixed quantiles (0% ∼ 25%, 25% ∼ 50%, 50% ∼ 75% and 75% ∼ 100%). The results indicated that although overall the population had a positive treatment effect (i.e., Simva+EZ is better), the treatment effects varied across different patient subgroups when stratified by the predictive score. Furthermore, the predictive score determined by PRS-PGx-Bayes was generally superior to other methods for patient stratification. Specifically, ratios of top 75% ∼ 100% subgroup to bottom 0% ∼ 25% subgroup in terms of treatment effects were 1.27, 1.48, 1.65, 3.27, 1.94, and 10.28 for PRS-Dis-CT, PRS-Dis-LDpred2, PRS-PGx-Unadj, PRS-PGx CT, PRS-PGx-GL, and PRS-PGx-Bayes, respectively. In Fig 4b, patients were stratified into top 10%, 20%, · · · , 90% percentile of the predictive score vs. the rest, respectively. The corresponding between group differential treatment effect was calculated. Among the six methods, PRS-PGx-Bayes had the largest differential treatment effect across different cutoff points followed by PRS-PGx-CT and PRS-PGx-GL; and the rest three methods had the lowest differential treatment effects. The optimal cutoff point for PRS-PGx-Bayes occurred between 50% ∼ 60%, with differential treatment effect around 0.52. Instead of using fixed quantiles, we also determined the optimal quantile cutoffs with the largest differential treatment effect estimated from the 5-fold cross-validation (training and testing) procedures. The corresponding ability of PRS-PGx-Bayes to stratify patients with greater clinical benefits was assessed in different validation sets with the results summarized in Supplementary Figure 10. The differences in treatment effects between high and low predictive score subgroups were very clear in the overall population as well as in four out of five CVs ..."

Minor comments
# 5 Avoid superfluous words which appear to overhype the message of the study. For example "paradigm shift" in the Abstract could be toned down.
Response: Thanks for the suggestion. Per your suggestion, we deleted "paradigm" in the Abstract: "... We propose a paradigm shift from disease PRS to PGx PRS approaches by simultaneously modeling both the prognostic and predictive effects and further make this paradigm shift possible by developing a series of PRS-PGx methods, including a novel Bayesian regression approach (PRS-PGx-Bayes) ..." We would like to thank two reviewers for carefully reviewing our response letter and giving us additional comments and suggestions. Their input enabled us to significantly improve the quality of our manuscript, which is highly appreciated. Detailed responses to reviewers are given below. We highlighted the changes (in blue) related to the comments in this revision.
Response to Reviewer 1 # 1 I think the authors have carefully answered most of my comments (Reviewer 1), thank you for your efforts. However, I feel like some of the explanations they give should also be mentioned in the manuscript so that it is useful to other readers as well (e.g. for comments #1, #2, #6, and #13).
Response: Thank you for your comments and suggestions. We are glad to know that the reviewer was satisfied with our responses to most of the comments. We agree with the reviewer that some of the explanations should also be mentioned in the manuscript. Per your suggestion, we mentioned those key explanation points in our responses to the comments #1, #2, #6, and #13 below as well as in the main manuscript.
• Comment #1: explain why we need to adjust for the baseline in the model -We revised the following sentences in the "Methods: IMPROVE-IT PGx GWAS data analysis" section: "... As recommended by Zhang et al. (2022), we adjusted for the baseline LDL-C level Y 0 (in the log scale) in the model to appropriately control the type I error rate (or genome inflation) ..." • Comment #2: briefly summarize the novelties of the PRS-PGx-Bayes method we proposed in terms of the concept, the model, and the technicality -We added the following sentences in the "Discussion" section: "... To our best knowledge, no existing methods can be directly applied to jointly model both prognostic and predictive effects for drug response prediction. The necessity of using PGx PRS approaches instead of disease PRS approaches is validated by the proof of extremely stringent assumptions needed for the disease PRS approach to predict drug response ..." "... Compared with the PRS-Dis methods, the PRS-PGx approaches can shrink variants' main and interaction effect sizes simultaneously, and construct PGx scores including a prognostic PRS and a predictive PRS. Thus, in our PRS-PGx-Bayes method, we propose to accommodate both effects and their correlation by modeling a variance-covariance matrix. Although the inverse Wishart (IW) prior is widely used, in this paper, we choose to use the hierarchical half-t prior (Huang and Wand, 2013) instead due to the limitation of IW (i.e., IW prior imposes a dependency between the correlations and the variances). Moreover ..."